CREST – Classification Resources for Environmental Sequence Tags
Abstract
Sequencing of taxonomic or phylogenetic markers is becoming a fast and efficient method for studying environmental microbial communities. This has resulted in a steadily growing collection of marker sequences, most notably of the small-subunit (SSU) ribosomal RNA gene, and an increased understanding of microbial phylogeny, diversity and community composition patterns. However, to utilize these large datasets together with new sequencing technologies, a reliable and flexible system for taxonomic classification is critical. We developed CREST (Classification Resources for Environmental Sequence Tags), a set of resources and tools for generating and utilizing custom taxonomies and reference datasets for classification of environmental sequences. CREST uses an alignment-based classification method with the lowest common ancestor algorithm. It also uses explicit rank similarity criteria to reduce false positives and identify novel taxa. We implemented this method in a web server, a command line tool and the graphical user interface program MEGAN. Further, we provide the SSU rRNA reference database and taxonomy SilvaMod, derived from the publicly available SILVA SSURef, for classification of sequences from bacteria, archaea and eukaryotes. Using cross-validation and environmental datasets, we compared the performance of CREST and SilvaMod to the RDP Classifier. We also utilized Greengenes as a reference database, both with CREST and the RDP Classifier. These analyses indicate that CREST performs better than alignment-free methods, with a higher recall rate (sensitivity) as well as higher precision, and with the ability to accurately identify most sequences from novel taxa. Classification using SilvaMod performed better than with Greengenes, particularly when applied to environmental sequences. CREST is freely available under a GNU General Public License (v3) from http://apps.cbu.uib.no/crest and http://lcaclassifier.googlecode.com.
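The combination of a lowest common ancestor (LCA) assignment with rank similarity criteria, as described in the abstract, can be illustrated with a minimal Python sketch. This is not the CREST implementation: the rank list, the percent-identity cutoffs in `MIN_IDENTITY`, the 2% bit-score window, and the function names are all assumptions chosen for illustration, and the alignment hits are assumed to be already parsed into (lineage, bit score, percent identity) tuples.

```python
from typing import List, Sequence, Tuple

# Hypothetical rank order and minimum percent-identity cutoffs.
# These values are illustrative assumptions, not taken from the abstract.
RANKS = ["domain", "phylum", "class", "order", "family", "genus", "species"]
MIN_IDENTITY = {
    "domain": 0.0, "phylum": 80.0, "class": 85.0, "order": 90.0,
    "family": 95.0, "genus": 97.0, "species": 99.0,
}

# One alignment hit: (lineage from domain downwards, bit score, percent identity)
Hit = Tuple[Sequence[str], float, float]


def lowest_common_ancestor(lineages: List[Sequence[str]]) -> List[str]:
    """Return the longest prefix shared by all lineages."""
    if not lineages:
        return []
    shared: List[str] = []
    for names in zip(*lineages):
        if len(set(names)) != 1:
            break
        shared.append(names[0])
    return shared


def classify(hits: List[Hit], score_window: float = 0.02) -> List[str]:
    """Assign a lineage from alignment hits (sketch, assumed parameters).

    1. Keep hits whose bit score is within `score_window` of the best hit.
    2. Take the LCA of the kept lineages.
    3. Trim the assignment upwards until the best percent identity meets
       the cutoff of the deepest remaining rank.
    """
    if not hits:
        return []
    best_score = max(score for _, score, _ in hits)
    kept = [h for h in hits if h[1] >= best_score * (1.0 - score_window)]
    lca = lowest_common_ancestor([lineage for lineage, _, _ in kept])
    lca = lca[: len(RANKS)]  # ignore levels below the ranks defined above
    best_identity = max(identity for _, _, identity in kept)
    depth = len(lca)
    while depth > 0 and best_identity < MIN_IDENTITY[RANKS[depth - 1]]:
        depth -= 1
    return lca[:depth]


if __name__ == "__main__":
    entero = ["Bacteria", "Proteobacteria", "Gammaproteobacteria",
              "Enterobacterales", "Enterobacteriaceae"]
    example_hits: List[Hit] = [
        (entero + ["Escherichia", "Escherichia coli"], 500.0, 96.0),
        (entero + ["Salmonella", "Salmonella enterica"], 495.0, 95.5),
        (["Bacteria", "Firmicutes", "Bacilli"], 300.0, 85.0),
    ]
    print(" > ".join(classify(example_hits)))
```

In this sketch, two near-equal top hits from different genera pull the assignment up to their shared family, while a single hit with low identity is trimmed to a higher rank instead of being reported at species level. That trimming is how a similarity criterion can flag a sequence as a potentially novel taxon within a known group.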
Similar resources
P-215: Discovery of A Novel APA Variant of A Human Potential Gene Based on Expressed Sequenced Tags Analysis
Background: Expressed sequence tags (ESTs) are sequences of cDNA fragments prepared from different tissue sources. There are over one million of these sequences in the publicly available database, and these sequences are believed to represent more than half of all human genes. The ESTs belong to different cDNA libraries, each prepared from one particular cell type, organ, or tumor. Therefore, th...
Evaluating tag filtering techniques for web resource classification in folksonomies
Social or collaborative tagging systems emerged as a novel classification scheme on the Web based on the collective knowledge of people. In sites such as Del.icio.us, Technorati or Flickr, users annotate a variety of resources, including Web pages, blogs, pictures, videos or bibliographic references; using freely chosen textual labels or tags. Underlying collaborative tagging systems are ternar...
Identification of Drought-Responsive Universal Stress Proteins in Viridiplantae
Genes encoding proteins that contain the universal stress protein (USP) domain are known to provide bacteria, archaea, fungi, protozoa, and plants with the ability to respond to a plethora of environmental stresses. Specifically in plants, drought tolerance is a desirable phenotype. However, limited focused and organized functional genomic datasets exist on drought-responsive plant USP genes to...
Evaluation of Land Cover Changes Using Remote Sensing Technique (Case study: Hableh Rood Subwatershed of Shahrabad Basin)
The growing population and increasing socio-economic necessities create pressure on land use/land cover. Nowadays, land use change detection using remote sensing data provides quantitative and timely information for management and evaluation of natural resources. This study investigates the land use changes in part of Hableh Rood Watershed of Iran using Landsat 7 and 8 (Sensor ETM+ and OLI) ...
Organizing Resources on Tagging Systems using T-ORG
Tagging systems (or folksonomies) like Flickr or Delicious are expanding tremendously. More and more resources are being added to them. As the resources present on these systems increase in number, it becomes difficult to explore them. For this purpose, we present a system, T-ORG, which provides a mechanism to organize these resources by classifying the tags (or keywords) attached to t...